-
Notifications
You must be signed in to change notification settings - Fork 1.8k
C/C++ overlay: Add basic Overlay.qll file
#20909
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Overlay.qll file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces overlay analysis support for C/C++ by adding a basic Overlay.qll file that defines predicates for handling entity discarding in overlay variants.
Key Changes:
- Adds predicates to distinguish between overlay and base variants of C/C++ analysis
- Implements logic to discard elements from the base variant when files have changed
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| cpp/ql/lib/semmle/code/cpp/internal/Overlay.qll | New file defining overlay-specific predicates including variant detection, location handling, and entity discard logic |
| cpp/ql/lib/cpp.qll | Imports the new Overlay module to integrate overlay functionality into the C/C++ library |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
jketema
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions below.
| var_decls(e, _, _, _, loc) | ||
| or | ||
| // Indirect location (entities) | ||
| exists(@var_decl vd | var_decls(vd, e, _, _, _) | var_decls(vd, _, _, _, loc)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's so special about var decls that only those occur here? Is that just because the internal test requires these?
Why is the "indirect" case needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing special about var_decls. It is the only one present because of our internal test. It gives us a way to verify the most basic behavior before expanding coverage later.
The indirect case filters out variable entities that doesn't have direct locations. Their locations come from their @var_decl declarations. If we skip this, queries that use variables without checking locations would still pick up base entities coming from changed files.
If we modify our internal test to a simple query, i.e.,:
import cpp
from Variable v
select v
we would not filter out the variable from the base database unless the indirect case is there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what this all means. I'm generally confused by the terminology you're using here ("indirect"), because that does not match anything I ever encountered.
What problem are you trying to solve by adding
exists(@var_decl vd | var_decls(vd, e, _, _, _) | var_decls(vd, _, _, _, loc))here?
My concern with all of this is, that by adding all these intricacies here, we're going to get something that is non-maintainable/scalable.
| de.existsInBase() and | ||
| overlayChangedFiles(de.getFilePath()) and | ||
| // Only discard if ALL file paths are in changed files | ||
| forall(string path | path = de.getFilePath() | overlayChangedFiles(path)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that there are two cases here:
- Elements that have a location in the relation that define them (like variable declarations). In this case I don't think we need the
forall, and we can do what you were doing before. - Elements that do not come with their own location, but where the location can come from one or more other elements (this is probably just variables, functions, and types?).
Do we want to make that distinction more explicit?
| overlay[discard_entity] | ||
| private predicate discardable(@element e) { | ||
| e = any(Discardable d | d.existsInBase() and overlayChangedFiles(d.getFilePath())) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're going to need to do something more complex for C. For example, if you have foo.h:
#define A 1
and bar.h:
#if A
int a;
#elif B
int b;
#endif
then if you change foo.h to
#define B 1
then bar.h's declarations change, but bar.h has not changed.
It gets even more complex when you might have different header variants, e.g. we might have one include or bar.h where A is defined, but a different include elsewhere where B is defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something we should do here, or should it happen when we detect which files changed?
…n unchanged file Previously, an entity would be discarded if it had any location in a changed file. This caused issues for entities with multiple declaration entries, such as extern variables declared in one file and defined in another. For example, given: // a.c (changed) // b.c (unchanged) extern int x; int x; The variable `x` should be preserved because it has a location in the unchanged file b.c, even though it also has a location in the changed file a.c.
a06c6d9 to
3d69286
Compare
Split the discard predicate into two: one for single-location elements and one for multi-location elements.
1624c09 to
eac06dd
Compare
See internal PR.